Robust i-vector based adaptation of DNN acoustic model for speech recognition

Authors

Sri Garimella

Arindam Mandal

Nikko Strom

Björn Hoffmeister

Spyridon Matsoukas

Sree Hari Krishnan Parthasarathi

Abstract

In the past, conventional i-vectors based on a Universal Background Model (UBM) have been successfully used as input features to adapt a Deep Neural Network (DNN) Acoustic Model (AM) for Automatic Speech Recognition (ASR). In contrast, this paper introduces Hidden Markov Model (HMM) based i-vectors, which are estimated using HMM state alignment information from an ASR system. Further, we propose passing these HMM based i-vectors through an explicit non-linear hidden layer of a DNN before combining them with standard acoustic features, such as log filter bank energies (LFBEs). To improve robustness to mismatched adaptation data, we also propose estimating i-vectors in a causal fashion for training the DNN, restricting the connectivity among hidden nodes in the DNN, and applying a max-pool non-linearity at selected hidden nodes. In our experiments, these techniques yield about 5-7% relative word error rate (WER) improvement over the baseline speaker independent system in matched conditions, and a substantial WER reduction for mismatched adaptation data.
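The idea of passing an i-vector through its own non-linear hidden layer before combining it with LFBEs can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the sigmoid activation, the layer sizes, the group-wise max-pool, and the choice to tile one i-vector across all frames of an utterance are assumptions made only for illustration, and the i-vector estimation itself (HMM based, causal) is not shown.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic non-linearity."""
    return 1.0 / (1.0 + np.exp(-x))


def max_pool_groups(h, group_size):
    """Max over non-overlapping groups of hidden nodes.

    h has shape (n_rows, n_nodes); n_nodes must be divisible by group_size.
    """
    n_rows, n_nodes = h.shape
    if n_nodes % group_size != 0:
        raise ValueError("n_nodes must be divisible by group_size")
    return h.reshape(n_rows, n_nodes // group_size, group_size).max(axis=2)


def combine_ivector_with_lfbe(lfbe, ivector, W_iv, b_iv, group_size=4):
    """Hypothetical i-vector branch followed by feature concatenation.

    lfbe:    (n_frames, n_lfbe) acoustic features for one utterance
    ivector: (n_iv,) i-vector for the same utterance (assumed given)
    W_iv:    (n_iv, n_hidden) weights of the i-vector hidden layer
    b_iv:    (n_hidden,) bias of the i-vector hidden layer
    """
    # Explicit non-linear hidden layer applied to the i-vector only.
    hidden = sigmoid(ivector @ W_iv + b_iv)
    # Max-pool over groups of hidden nodes (assumed grouping scheme).
    pooled = max_pool_groups(hidden[None, :], group_size)[0]
    # Repeat the pooled vector for every frame and append it to the LFBEs.
    tiled = np.tile(pooled, (lfbe.shape[0], 1))
    return np.concatenate([lfbe, tiled], axis=1)


# Usage with random placeholder data (all sizes are illustrative).
rng = np.random.default_rng(0)
lfbe = rng.standard_normal((5, 64))
ivector = rng.standard_normal(100)
W_iv = rng.standard_normal((100, 32)) * 0.1
b_iv = np.zeros(32)
features = combine_ivector_with_lfbe(lfbe, ivector, W_iv, b_iv, group_size=4)
```

In this sketch the combined features would then feed the remaining layers of the acoustic model; how the paper restricts connectivity among hidden nodes is not specified in the abstract and is not modelled here.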


Similar resources

A study on deep neural network acoustic model adaptation for robust far-field speech recognition

Even though deep neural network acoustic models provide an increased degree of robustness in automatic speech recognition, there is still a large performance drop in the task of far-field speech recognition in reverberant and noisy environments. In this study, we explore DNN adaptation techniques to achieve improved robustness to environmental mismatch for far-field speech recognition. In contr...


Optimizing DNN Adaptation for Recognition of Enhanced Speech

Speech enhancement directly using deep neural network (DNN) is of major interest due to the capability of DNN to tangibly reduce the impact of noisy conditions in speech recognition tasks. Similarly, DNN based acoustic model adaptation to new environmental conditions is another challenging topic. In this paper we present an analysis of acoustic model adaptation in presence of a disjoint speech ...


Uncertainty Decoding with Adaptive Sampling for Noise Robust DNN-Based Acoustic Modeling

Although deep neural network (DNN) based acoustic models have obtained remarkable results, the automatic speech recognition (ASR) performance still remains low in noise and reverberant conditions. To address this issue, a speech enhancement front-end is often used before recognition to reduce noise. However, the front-end cannot fully suppress noise and often introduces artifacts that are limit...


Incorporating a Generative Front-End Layer to Deep Neural Network for Noise Robust Automatic Speech Recognition

It is difficult to apply well-formulated model-based noise adaptation approaches to Deep Neural Network (DNN) due to the lack of interpretability of the model parameters. In this paper, we propose incorporating a generative front-end layer (GFL), which is parameterised by Gaussian Mixture Model (GMM), into the DNN. A GFL can be easily adapted to different noise conditions by applying the model-...


Speaker adaptation of DNN-based ASR with i-vectors: does it actually adapt models to speakers?

Deep neural networks (DNN) are currently very successful for acoustic modeling in ASR systems. One of the main challenges with DNNs is unsupervised speaker adaptation from an initial speaker clustering, because DNNs have a very large number of parameters. Recently, a method has been proposed to adapt DNNs to speakers by combining speaker-specific information (in the form of i-vectors computed a...



Publication date: 2015
